Berkeley CS 61C Lecture 8

Here are some notes and corrections on this lecture:

Each note begins with a time; "ca." in front of a time means that it is approximate.

ca. 10:00 -- NFA stands for non-deterministic finite automaton. It's a theoretical machine that can be realized in hardware and used to control circuits.

ca. 15:15 -- Assembler directives don't assemble into machine code. Each one gives an instruction to the assembler itself, which results in some action. Sometimes the action is to add words or bytes to the data segment (.word or .byte). Sometimes it simply sets some state within the assembler (.data or .text) or tells it to mark a symbol as special in some way (.globl).
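
To make the three kinds of action concrete, here is a minimal sketch in Python of how an assembler might react to these directives. It is not how MARS is implemented. The function name assemble_directives is made up for illustration, little-endian byte order is an assumption, labels and negative values are not handled, and real instructions are only counted, not encoded.

```python
def assemble_directives(lines):
    """Handle a few MIPS-style directives; real instructions are only counted."""
    segment = "text"           # assembler state, changed by .text / .data
    data = bytearray()         # bytes emitted into the data segment
    global_symbols = set()     # symbols marked special by .globl
    text_words = 0             # number of real instructions seen

    for line in lines:
        tokens = line.replace(",", " ").split()
        if not tokens:
            continue
        op, args = tokens[0], tokens[1:]
        if op in (".data", ".text"):
            segment = op[1:]                       # only sets state
        elif op == ".globl":
            global_symbols.update(args)            # only marks symbols
        elif op == ".word":
            for a in args:                         # adds 4 bytes per value
                data += int(a, 0).to_bytes(4, "little")
        elif op == ".byte":
            for a in args:                         # adds 1 byte per value
                data += int(a, 0).to_bytes(1, "little")
        elif segment == "text":
            text_words += 1                        # a real instruction
    return bytes(data), global_symbols, text_words


source = [
    ".data",
    ".word 1, 0x10",
    ".byte 7",
    ".text",
    ".globl main",
    "add $t0, $t1, $t2",
]
data, global_symbols, text_words = assemble_directives(source)
```

Only the add line contributes to the text segment; every other line either changes the assembler's state or adds bytes to the data segment.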

ca. 20:10 -- Typo: in the right column "mul" should be "mult".

ca. 22:20 -- left(str) and right(str) are not code; they just give you the idea.

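ca. 31:30 -- "branch target" is a technical term. Using two passes in the assembler solves two problems: forward references (a branch to a label that has not been seen yet), and the fact that expanding fakes (pseudo-instructions) into real instructions changes the size of the code and therefore the address of every label that follows.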
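
A rough sketch of the two-pass idea, in Python. Pass one walks the code only to assign an address to each label, counting a fake as however many real instructions it expands to; pass two can then resolve every branch, including forward ones. The names EXPANSION_WORDS, pass_one and pass_two are made up for illustration, and the table assumes la and li always expand to two words, which is a simplification.

```python
# Hypothetical table: words occupied by each fake after expansion (assumed).
EXPANSION_WORDS = {"la": 2, "li": 2}


def size_in_words(op):
    return EXPANSION_WORDS.get(op, 1)


def pass_one(lines):
    """Assign a byte address to every label, accounting for expansion."""
    symbols, address = {}, 0
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            symbols[line[:-1]] = address
        elif line:
            address += 4 * size_in_words(line.split()[0])
    return symbols


def pass_two(lines, symbols):
    """Resolve each branch to a word offset relative to the next instruction."""
    offsets, address = [], 0
    for line in lines:
        line = line.strip()
        if not line or line.endswith(":"):
            continue
        op = line.split()[0]
        if op in ("beq", "bne"):
            label = line.replace(",", " ").split()[-1]
            offsets.append((address, (symbols[label] - (address + 4)) // 4))
        address += 4 * size_in_words(op)
    return offsets


program = [
    "beq $t0, $zero, done",   # forward reference, at address 0
    "la $a0, msg",            # fake: two words, addresses 4 and 8
    "addi $t0, $t0, 1",       # address 12
    "done:",                  # label lands at address 16
    "jr $ra",
]
symbols = pass_one(program)
branches = pass_two(program, symbols)
```

Here done ends up at address 16 rather than 12 because la takes two words, so the branch at address 0 gets an offset of 3 words.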

ca. 35:00 -- "such as the la instruction" does not mean that the la instruction is a piece of data in the static data segment; it means that an la instruction can refer to such a thing.

ca. 35:40 -- The file header is like a machine-readable table of contents.

ca. 38:15 -- Typo: the library file is libc.a, not libc.o.

ca. 40:15 ff. -- The full name is "relocating linker"; it does two things: it relocates, and it links. The assembler bases every .o file at zero, so relocating is obviously necessary since we can't have them overlapping each other. Why zero? Well, the assembler has no idea where the code is going to end up after the linker gets through with it, so zero is as good as any; it's actually better, since it makes relocating easier for the linker: every address in the file is just an offset to be added to the base the linker chooses.
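
The two jobs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the object-file layout (a dict with name, size, defs and refs) and the file names are invented, only the text segment is modelled, and TEXT_BASE is the conventional MIPS text start, though any base would do.

```python
TEXT_BASE = 0x00400000   # assumption: conventional start of the MIPS text segment


def link(objects):
    """Relocate zero-based object files, then resolve their symbol references.

    Each object is a dict with 'name', 'size' in bytes, 'defs' mapping a symbol
    to its offset from zero, and 'refs' listing (instruction offset, symbol).
    """
    bases, addresses, address = {}, {}, TEXT_BASE
    # Relocate: give each file its own non-overlapping base address.
    for obj in objects:
        bases[obj["name"]] = address
        for symbol, offset in obj["defs"].items():
            addresses[symbol] = address + offset   # zero-based offset + base
        address += obj["size"]
    # Link: each reference becomes (where to patch, final address of symbol).
    patches = []
    for obj in objects:
        for offset, symbol in obj["refs"]:
            patches.append((bases[obj["name"]] + offset, addresses[symbol]))
    return bases, patches


objects = [
    {"name": "main.o", "size": 12, "defs": {"main": 0}, "refs": [(4, "print")]},
    {"name": "print.o", "size": 8, "defs": {"print": 0}, "refs": []},
]
bases, patches = link(objects)
```

Both files define their first symbol at offset 0. After relocation main.o sits at 0x00400000 and print.o at 0x0040000C, and the reference at offset 4 of main.o is patched to point at 0x0040000C.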

ca. 41:00 -- Branch instructions, unlike jumps, will never have a target outside the file they are in.

ca. 42:00 -- $gp speeds up references into the static data segment, but we don't care about speed, and it is a complication. So we'll ignore it.

ca. 43:12 -- "need to know how big text is ..." is not true since static data always begins at 0x10000000.

ca. 1:02:50 -- MARS does not use s.str and r.str; it uses absolute addresses instead. cf. 22:20

ca. 1:03:30 -- Note that main: is at 0x00000000. cf. 40:15

ca. 1:11:30 -- "includes entire library ..." (on slide) is NOT true. The linker picks out of the library just the pieces it needs to satisfy the references that it can't find anywhere else.
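
A small Python sketch of that selection. It is a simplified model, not the algorithm of any particular linker: the archive is represented as a dict from member name to (symbols defined, symbols referenced), the member names are made up, and members are assumed to refer only to symbols defined somewhere in the same archive.

```python
def select_members(undefined, archive):
    """Return the set of archive members needed to satisfy `undefined`."""
    defined_by = {
        symbol: member
        for member, (defs, _refs) in archive.items()
        for symbol in defs
    }
    needed, resolved, pending = set(), set(), list(undefined)
    while pending:
        symbol = pending.pop()
        if symbol in resolved:
            continue
        resolved.add(symbol)
        member = defined_by.get(symbol)
        if member is None:
            raise KeyError(f"undefined symbol: {symbol}")
        if member not in needed:
            needed.add(member)
            # A member pulled in may have unresolved references of its own.
            pending.extend(archive[member][1])
    return needed


# Hypothetical archive contents, for illustration only.
archive = {
    "printf.o": ({"printf"}, {"write"}),
    "write.o": ({"write"}, set()),
    "qsort.o": ({"qsort"}, set()),
}
members = select_members({"printf"}, archive)
```

A program that calls only printf pulls in printf.o and, through it, write.o; qsort.o stays in the library.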

This collection of programs (assembler, linker, and perhaps others) is called a toolchain.